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Abstract 

A novel algorithm was recently presented to utilize emerging time dependent 
probability density data to extract molecular potential energy surfaces. This 
paper builds on the previous work and seeks to enhance the capabilities of the 
extraction algorithm: An improved method of removing the generally ill-posed 
nature of the inverse problem is introduced via an extended Tikhonov reg- 
ularization and methods for choosing the optimal regularization parameters 
are discussed. Several ways to incorporate multiple data sets are investigated, 
including the means to optimally combine data from many experiments ex- 
ploring different portions of the potential. In addition, results are presented 
on the stability of the inversion procedure, including the optimal combina- 
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tion scheme, under the influence of data noise. The method is applied to the 
simulated inversion of a double well system to illustrate the various points. 



Typeset using REVTgX 



I. INTRODUCTION 



To fully understand chemical dynamics phenomena it is necessary to know the underly- 
ing potential energy surfaces (PES) Surfaces can be obtained by two means: ab initio 
calculations and the inversion of suitable laboratory data This paper is con- 

cerned with an emerging class of laboratory data [fT~5| P~Tf| with special features for inversion 
purposes. Traditional sources of laboratory data for inversion produce an indirect route 
to the potential requiring the solution of Schrodinger's equation in the process. An 



alternative suggestion [0,^0] has been put forth to utilize ultrafast probability density data 



from diffraction observations or other means |2lH26j to extract adiabatic potential surfaces. 
Such data consists of the absolute square of the wavefunction. Although the phase of the 
overall wavefunction is not available, there is sufficient information in this data to extract 
the potential fully quantum mechanically without the solution of Schrodinger's equation. 
Instead, the proposed procedure rigorously reformulates the inversion algorithm as a linear 
integral equation utilizing Ehrenfest's theorem [^7] for the position operator. Additional 
attractive features of this algorithm are (a) the procedure may be operated non-iteratively, 
(b) no knowledge is required of the molecular excitation process leading to the data and (c) 
the regions where the potential may be reliably extracted are automatically revealed by the 
data. 

Extensive efforts are under way to achieve the necessary temporal and spatial resolution 
of the probability density data necessary for inversion processes as well as for other applica- 
tions [p0|1 . In anticipation of these developments a number of algorithmic challenges require 
attention to provide the means to invert such data. This paper aims to build on the previous 



work [19|] and address some of these needs. In particular this paper will consider (i) optimal 
choices for regularizing the inversion procedure, (ii) incorporation of multiple data sets and 
(hi) inclusion of data sampled at discrete time intervals. These concepts are developed and 
illustrated for the simulated inversion of a double well potential. 

The paper is organized as follows. The basic inversion procedure and the model system 
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are given in Section ||. Based on the inversion algorithm derived in Ref. |19j an extended 
regularization procedure is presented in Section |TTT| followed by a discussion of a modified 
time integration scheme applicable to different types of experimental data sampling. This 
development naturally leads to consideration of an optimal combination of data from differ- 
ent measurements. A proof on how to optimally combine the data is given in Appendix |A|. 
The stability of this data combination procedure under the influence of noise is discussed as 
well. Section [V] summarizes the findings of this paper. 



II. THE BASIC INVERSION PROCEDURE AND THE MODEL SYSTEM 



The algorithms developed in this paper will be illustrated for a one-dimensional system 



but the generalization to higher dimensions is straightforward [pq] : the major difference with 
higher dimensions is the additional computational effort involved. Atomic units are used 
throughout this work. 

For a system whose dynamics is governed by the Schrodinger equation 

i d 2 



+ V(x) 



ip(x,t) 



2m dx 2 

the time evolution of the average position obeys Ehrenfest's theorem 



(1) 



^ = m ~dt 2 l x P( x 't)d-X+ J u(x) p(x,t) dx 



(2) 



where u(x) = dV(x)/dx and p(x,t) = \ip(x, t)\ 2 . In this work the probability density p(x,t) 
is assumed to be observed in the laboratory and the goal is to determine the potential energy 
surface (PES) V(x) from the gradient u(x). 

Following JTPJ, Eq.(^) can be used to construct a Gaussian least squares minimization 
problem to determine the PES gradient u(x) 



J {u(x)} = - 



u(x) p(x, t) dx + m ~^2 l x P( x ^t)dx 



dt 



(3) 



The time averaging acts as a filtering process to increase inversion reliability by gathering 
together more data. This will generally increase reliability which in principle is only limited 



by the exploratory ability of the wavepacket. Beyond some point in time little information 
on the potential may be gained by taking further temporal data starting from any potential 
initial condition. 

Variation with respect to u(x) results in a Fredholm integral equation of the first kind 
SJ {u(x)} 



5u(x) 

with righthand side (RHS) 



=> / A(x',x)u(x')dx' = b(x) (4) 



T 

Tti f d^ 

b{x) = -- Jp(x,t)-^ I x'p(x',t)dx'dt (5) 





and symmetric, positive semidefmite kernel 



T 
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A{x',x) = - j p(x',t)p(x,t)dt . (6) 







Treated as an inverse problem, Eq.([|) produces the desired PES gradient u(x) as its solution. 
For numerical implementation we resort to the matrix version and its formal solution 

A-uAx = b u = A' 1 ■ b Ax -1 ; . (7) 

Here the integral in Eq.(f|) is evaluated at points of equal spacing Ax. 

This approach to seeking the PES has a number of attractive features ||19|| . The formu- 



lation requires no knowledge of any preparatory steps to produce a specific ip(x, 0) which 
evolves freely to produce p(x, t). The generation of A(x, x') and b(x) depends only on p(x, t) 
and begins when the observation process is started. Moreover, although this is a fully quan- 
tum mechanical treatment there is no need to solve Schrodinger's equation to extract the 
PES. The dominant entries of A(x,x r ) and b(x) automatically reveal the portions of the 
PES that may be reliably extracted. The linear nature of Eq.(^j) is very attractive from a 
practical perspective. 

Notwithstanding these attractions, a principal problem to manage is the generally sin- 
gular nature of the kernel of the integral equation in Eq.(Mj). The kernel's nullspace makes 



it difficult to solve the inverse problem and leads to an unstable and ambiguous solution 
u(x), two characteristics that generally define the ill-posedness of inverse problems. There 
are two major reasons for the ill-posedness of the inverse problem in Eqs. (Q) and ([7|). 
Firstly, it is not possible to continuously monitor the wavepacket with arbitrary accuracy 
and information is lost due to discrete data sampling in space and time. Secondly, the 
ill-posedness is due to the wavepacket only exploring a subspace of the PES. In regions 
untouched by the wavepacket with p(x, t) ~ for all observation times t the kernel entries 
vanish as A(x, x') = A(x' , x) = ^ p(x, t)p(x' , t) dt ~ 0. Hence these regions correspond to 
zero-entry rows and columns in the kernel matrix A and constitute its nontrivial nullspace. 
In general, the solution u(x) will only be reliable in regions where p(x, t) has significant 
magnitude during its evolution. The inversion procedure can manage the null space with 
the help of a suitable regularization procedure. Singular value decomposition and iterative 
solution schemes are available (cf. for an overview), but here we will employ extended 

Tikhonov regularization (see Section |T|). 

The procedures developed in this paper are applied to a simulated inversion with a system 
taken to have a slightly asymmetric double well potential [Ml 



V{x) = A( x _ qo ) + V A/2 ( x _ qo )\ x + qo f + A (8) 

with parameters 

g = 1.0 (9) 

A = 0.000 257 (asymmetry) (10) 

^ = 0.006 25 (barrier height) . (11) 



In the work of N. Doslic et al. [|3T| this PES represents a one dimensional model for the 
intramolecular proton transfer in substituted malonaldehyde (see Fig. |1|). The particle mass 
is accordingly that of hydrogen. 

The wavepacket propagations to obtain the simulated p(x, t) data employed the split 
operator method (cf. |32]j33[1 ). For propagation as well as inversion we used a grid with 



8192 points over the range —4.0 ^ x ^ 4.0. A time step At prop = 3 was chosen and 
total propagation time was T = 1200. The small values of At prop and Ax prop ensured good 
convergence of the numerical propagation procedure. 

The initial wavefunctions were normalized Gaussian wavepackets of width a = 0.05. 
As stated earlier, the inversion algorithm requires no knowledge of how these packets were 
formed, but generally one may assume that a suitable external laser field was applied for 
times t < 0. The initial packets were placed at the left (L) and right minimum (R) of the 
PES, on top of the barrier (T), and at a location high on the potential (H). The wavepacket 
positions are illustrated in Fig. [l| and their exact values, the associated average energies 
and the classical turning points at these energies are given table |. The inversion process 
employed a time step and grid spacing that differed from those used in the propagation, 
as high spatial and temporal resolution is difficult to attain in the laboratory. Hence, we 
employed only a portion of all the available propagation data p(x, t) in time and space. We 
will present inversion results using every 16th propagation grid point (i.e., Aa; = 16- Ax prop ) 
and every fifth available snapshot (i.e., At = 5 • At prop ); even fewer snapshots could be used 
over a longer period of time with the criterion that roughly the same total amount of data 
is retained. The inversion results from these lower resolution data are very encouraging. 

The kernel matrices A for condition H and T are shown in Fig. |2|; similar plots apply to 
the cases L and R. The kernels are symmetric with respect to x = x' and their values cover 
a large dynamic range from ~ 10 3 down to 10~ 8 on the plotted domain. Significant entries 
are found predominantly on the matrix diagonal, close to the origin of the wavepacket, and 
also in the vicinity of the classical turning points. Beyond the classical turning points at a 
distance of approximately ±2.0 the kernel values fall off very rapidly for both configurations. 

For configuration H in Fig. 0a the initial narrow gaussian is peaked at the hydrogen dis- 
tance Xo = 1.75 with corresponding large entries around (x, x') ~ (2, 2). The wavepacket starts 
to spread and acquires momentum as it slides down the PES, which results in the broadening 
diagonal trace observed as the central structure in Fig. 0a. When the wavepacket reaches 
its lefthand turning point it spreads further (star structure around (x, x') ~ (— 1.75, —1.75)) 
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before it returns. This pattern coincides with the motion of the average position (x(t)) 
displayed for configuration H in Fig. ^a. 

Even higher symmetry can be observed for configuration T's kernel matrix A in Fig. 0b. 
The initial gaussian remains centered around x$ = 0.0052 and spreads to the left and right- 
hand well only. This is further supported by the motionless average position (x(t)) in Fig. |]a. 
Hence large entries in A result in the vicinity of (x, x') = (xq, xq) and the wavepacket's sym- 
metrical spread to the left and righthand side of the PES produces the spikes along the 
x-axis for x' = 0. Due the kernel's symmetry these spikes reappear as lines along the x'-axis 
for x — 0. Large contributions for x = x' will again lead to a pronouced diagonal and add to 
the snowflake appearance of Fig. 0b. 

The features of the kernels in Fig. |2| coincide with the nature of the inverse problem 
mentioned earlier: symmetry, ill-posedness, and automatic identification of the range where 
the PES may be be reliably extractable (i.e., where the kernel entries are large). For con- 
figuration H the relevant range is —2 < x < 2 and for configuration T only the vicinity 
of the barrier top should yield reliable PES information. In both cases we cannot expect 
reasonable solutions beyond ±2.0, which coincides with the classical turning points given in 
table |. 



III. AN IMPROVED REGULARIZATION PROCEDURE 



Tikhonov regularization [|34]] is straightforward to implement with simple control provided 
by suitable weight parameters. It provides a well defined means to stabilize the inversion 
and extract reliable PES information in those regions allowed by the data. 



This investigation goes beyond the initial work [19| to carefully explore various regu- 
larization options. Regularization has the goal of improving the accuracy of the solution, 
assuring stability and ease of use including computational simplicity. The functional Jo was 
augmented by a regularization term involving a set of increasingly higher order differential 
operators acting on u(x) 
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dx , (12) 



■h{u{x)} = J {u(x)} + J^cvT 1 / Kj - ) 

i/=o J l\ / 

with real coefficients a v > and a reference length £. In practice £ may be thought of as 
the spatial resolution of the data and in the present numerical simulation it was taken as 
Ax. For a multidimensional system, £ and a v will become direction dependent tensors. The 
parameter £ acts to ensure that all the new terms added to Jo have the same units as [u] 2 
as well as permits comparison of the roles of the dimensionless regularization parameters a v 
for different v and different grid spacings Ax. 

The previous work |IIJ did not employ a reference length as only the v = regularization 
term was considered. The parameter ao penalizes the value of u(x). The new terms go 
beyond and impose extra pressure on the gradient (v — 1), the curvature [y = 2) of u(x), 
etc. . 

Variation of J\ with respect to u(x) yields the modified inversion prescription 

u{x')dx' = b{x) . (13) 

The sum added to J in Eq.(|T2|) for regularization consisted of purely positive terms with 
derivatives of up to iVth order, resulting in an alternating series of only even derivatives up 
to order 2N in Eq.flTB]). Moreover, the Fredholm integral equation of the first kind has been 
transformed into an integro-differential equation for u(x) with the added terms dominating 
in the regions where kernel is singular. 

Due to the rapid growth in the order of the derivatives it is often sufficient to set N = 2, 
i.e., retaining standard, gradient, and curvature Tikhonov regularization. For numerical 
application Eq. (|l3"l) may be transformed into the matrix problem 

[A + a Ax~ 2 I - ax D + a 2 (Ax) 2 Q] ■ u Ax = b , (14) 

employing the unit matrix 11, as well as the second 



J A(x',x) + 6(x-x').f2^r 1 (-e^y 
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/-2 1 
1 -2 1 



D 



(Ax) 



\ 



1 -2 
1 
and the forth order differentiation band matrices 

/ 6 -4 1 



1 

- 2 J 



(15) 



(Ax) 



-4 6-4 1 
1-4 6-4 1 
1-4 6-4 1 



V 



(16) 



/ 



These are simple differencing expressions for the derivatives involved. Higher order expres- 
sions for the derivatives could be considered, but finite data resolution and laboratory noise 
will generally not warrant or support the added complexity. 

To investigate the inverse solution's dependence on the various regularization parameters 



in Eq.(14) several parameter scans for all four configurations L, T, R, H were performed for 
different resolutions Ax and combinations of a^-parameters. For the discussion in this paper, 
we selected typical results for the situation of H with Ax = 16Ax prop . The curves in Fig. |3| 
show the solution defect \Au\ and the system defect |As| as defined below in Eqs. (|TTD and 
(p0|) . While only |As| is an experimentally accessible figure of merit, an investigation of 
| Am | here allows for quantifying the quality of the inverse solution. For both error measures 
reported the plots are generated for each a v independently while the others are kept zero. 
Figures |]a and ||b display the solution defect 



|Au| 



1 



■>-b 



Xf) x a 



(Mexact(x) - u(x) f dx 



1/2 



(17) 



Figure |^a is computed with x a = —2.0, Xb = 2.0 (i.e. the central domain indicated in 
Fig. |2| and table H within which the inversion is expected to be valid) and Fig. [||b with 
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x a = —4.0, Xb = 4.0 (i.e., the full simulation range). The differences between the two cases 
are striking. The corresponding solution defects show a completely different shape with 
minima that differ by several orders of magnitude in a v . In Fig. |3|b the magnitude of the 
error in the active domain —2 < x < 2 is overestimated. This behavior in Fig. [5]b is due 
to large deviations between the exact gradient and the inversion solution for the gradient, 
which cannot be recovered reliably in the domain's outer limits. Thus we conclude that \Au\ 
scans should only be computed over the regions actually reached to a significant degree by 
the wavepacket (cf., Fig. |||a) to achieve reliable estimates of the inversion quality. 

The latter point is illustrated in Figs. |]b and ^c with the inverted results for u(x) ane 
V(x) with pure ot\ regularization of configurations H/Hi where a\ is given in table |I[ The 
two cases H/Hi differ in the domain employed in the inversion (i.e., the active domain for 
H and the full domain for Hi) and in the choice of optimal ot\ determined according to the 
| Am | scans. Thus we further conclude that the inversion process should be confined to the 
active domain to maintain stability. 

To find suitable integration regions from the laboratory data the normalized lefthand 

(x) , (x) 



aj(t) = I (./■ - (,r))-f)(.r. i)d.r / / 2f>(,r. t ) d.r (.18) 
and righthand variance 



a;(t) = J (x - (x)Yp(x, t) dx / j 2p(x,t)dx (19) 

(*> ' (») 

of the position operator can be helpful. Together with the position average (x) they can 
provide an estimate for the PES domain predominantly covered by the wavepacket motion. 
We present all three quantities ({x(t)) and o-p{t),a T {t) as grey shaded regions) in Fig. £|a. 
The results clearly show that for configuration H the range —2 < x < 2 is suitable. For 
configurations L, T, R an even smaller range is best (cf., table H). 

All the computations revealed that a gradient Tikhonov regularization based on oti per- 



forms better than the standard regularization based on ao utilized earlier fl9| . There is 
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some additional improvement in choosing the curvature regularization «2, but we found it 
to be less stable for coarse grids, which will be the standard situation in actual application. 

We also found little improvement in mixing the different regularization schemes. In 
general the a v regularization with the largest errors masks the positive effects of the others. 
Hence for all cases of the PES reconstruction we utilized only ot\ regularization (cf., the 
inversion in Figs. |]a and [|b with the optimal parameters given in table 0). 

As a measure of inversion quality and the role of regularization, we desire a quantity 
that is strictly available from the laboratory data p(x, t). A good choice is the system defect 
| As | defined by the norm of satisfying the system equation @ with the inverse solution u(x) 



The values of | As\ will depend on the regularization parameters a v . Weak regularization will 
produce a small value of |As|, but likely artificial structures in the PES. Over regularization 
will result in a smooth PES, that is systematically in error with diminished influence from 
the kernel A(x,x') on the inverse solution. The best choice for the a u is generally where 
| As | has risen and leveled off in a stable region as shown in Fig. |3]c. The figure shows that 
| As | naturally tends to zero as a v — > 0+ and monotonically rises until it reaches a plateau. 
There is very good agreement between the values of a v which show good results for |Au| 
in Fig. |3]a and the stable regularization region identified in Fig. |3|c. Thus |As| should be of 
practical utility in assigning regularization parameter values. 

The generally self-similar structures in Figs. 0a and 0c suggest that every regularization 
operator has a roughly similar effect. This added robustness is also attractive for practical 
application if it holds up regardless of the system. 



found via Eq. ( pT3[) 




(20) 
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IV. COMBINING DISTINCT SETS OF LABORATORY DATA 



Sections [IV A and [IV B| will cover different approaches to combining distinct sets of 



laboratory density data. Finally Section [IV C| will explore the impact of data noise on the 



inversion. 



A. Optimal combination of experimental data 

The functional Jq{u(x)} in its original form in Eq.(|3]) is expressed in terms of a uniform, 
continuous time integration of observed p(x, t) data. However, experimental circumstances 
including measurements at discrete snapshots in time or changes in the quality of data 
sampling may necessitate employing a weight function u(t) for a generalized approach to 
the time integration in the functional Jo- Thus we define Jo a s 



J {u(x)} 



oo 

d 



2 n l2 



u(x) p(x, t) dx + m—— xp(x,t)dx 
at 2 



u(t) dt . (21) 



o 



The choice u(t) = [©(£) — 0(i — T) ]/T, with being the Heaviside step function, will 
reduce J to J . 

Variation of Eq. (|2T|) leads to a modified inverse problem 

5J {u(x)} 



/ A(x',x)u(x')dx' = b(x) , (22) 



5u(x) 

with the new kernel 



A(x',x) = J p(x',t)p(x,t)uj(t)dt (23) 
o 

and RHS 

T 

b{x) = -rn J u(t) p(x, t) fx' p(x', t) dx' dt . (24) 



The weight u(t) does not alter the regularization terms in Eq. (|T3"D . If b(x) is rewritten using 
partial integration over time, then the weight function must be considered in this process. 
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The above equations were applied to two generic cases. First, we considered data gath- 
ered as snapshots in time i.e., u (t) = J2 j= i - 1), and evaluated Eqs. fl2|) and (J2|) with 
this weight. This procedure simply reduced all time integrations to sums over the sampled 
p data. Next, we considered the case in which the measurement process has been divided 
into two continuous time intervals of length T\ and T 2 separated by a period of time r. A 
reasonable choice of weights would either be 

e(t)-e(t-Tx) Qjt-T-T^-Qit-T-n-n) 
^ = j-yf 2 + j-yf 2 • (25) 

or 

= eg) - eg - Ti) + eit-T-T^-ojt-T-n-n) _ 

Ti T 2 

The choice depends on the desired emphasis to be given to the two data intervals. Here 
we chose to give the longer interval a larger contribution in A(x, x') than the shorter one, 
and this can be better achieved with using Eq. (|25|) ; this choice is reasonable, provided the 
measured data p(x, t) in both intervals are of comparable quality. Clearly many other issues 
can be incorporated into the choice of u(t) dictated by what is known about the nature of 
the data and the information sought about the PES. 
The kernel is now 

(Ti Ti+r+T 2 \ 
j+ j \ p(x',t)p(x,t)dt (27) 
Tl+t / 

and the RHS reads 

J \ J p(x, t) ^ Jx' p(x', t) dx' dt . (28) 

Ti+t / 

The interpretation of the weight in Eq.(^) is associated with performance of the inversion 
with an interrupted gathering of data from a single experiment. To explore this point further 
it is useful to rewrite Eqs. (127]) and (|2Sj) as 



Ai(x, x 1 ) + A 2 (x, x) } u(x') dx' = bi(x) + b 2 (x) , (29) 
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where the indices "1" and "2" denote the evident two data time domains. In this form 
the gathering of data from one interrupted experiment can also be interpreted as finding 
the simultaneous solution to the inverse problem of two different experiments. These two 
experiments could possibly be prepared with distinct controls could, for example, explore 
different regions of the PES. 

We found that it is optimal to simply combine these sets of data by addition as indicated 
in Eq . (|29|) . This procedure will yield an inverse solution uq{x) with accuracy greater than a 
linear combination u(x) = jiui{x) + vu 2 (x) of separate solutions to the individual problems 
"1" and "2" as explained below. 

Consider two experiments that yield two different inverse solutions satisfying their re- 
spective system equation 

J Ai i2 {x, x')ui ;2 (x') dx' = b 1)2 (x) . (30) 

Naturally there should be only a unique exact u ex ,(x) for the physical system. Hence both 
system solutions u± t2 in Eq.(|30"D can be decomposed into the exact solution and contamina- 
tion pieces from the kernel's nullspace 

ui, 2 (x) = Mex.(x) + a lj2 (x) + r h2 (x) . (31) 

The functions a\ j2 and ri j2 are associated with the nullspace of the two kernels with 0.1.2(0;) G 
ker(Ai) n ker(A 2 ) being the contamination from the common nullspace of Ai and A 2 and 
r 12 (x) the residual contribution unique to the respective kernel. The goal is to use the data 
to find an optimal solution Uq(x) with the smallest possible nullspace contribution. 

Exploiting the linearity of the inverse problem, we may add the two pieces of Eq. ( P0"D to 

get 

J Ai(x, x')ui{x') dx' + J A 2 (x, x')u 2 {x) dx' = b\{x) + b 2 {x) . (32) 

This doesn't fully satisfy Eq.Q2Tf) and it is in general not possible to construct the optimal 
solution uq(x) as a linear combination Wo^) — ^Ui(x) + vu 2 [x) with constant coefficients 
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fj,,v. To elucidate this point, we insert Uo(x) into Eg. (p9|) and with the help of Eqs.(pO|) 
and fl31| ) we get the cross terms 

y A 2 (x, x')ui(x') dx' = b 2 {x) + J A 2 {x,x')ri{x') dx' = b 2 {x) + 2 e\{x) , (33) 

where the prefactors /i, 1/ have been omitted. Hence uq(x) is not an optimal solution of 
Eq . (^9l) since it leaves errors i£j{x) that cannot be eliminated. However, by employing 
Eq. ( p9|) and adding the kernels and RHSs we can improve the quality of the inversion. 
No error terms like iSj{x) will appear since by construction the resulting Uq(x) can be 
decomposed as Uq{x) = u(x) + a (x). A contribution from r (x) as in Eq.(|3~I|) will not arise, 
as proved in Appendix Thus, the solution of the combined problem will gain in quality 
by virtue of the reduced nullspace of the new kernel A% + A%. 

These optimality results are rigorous but it must be added that in general any combina- 
tion of a finite amount of data will not fully eliminate the nullspace. However in the cases 
under comparison here the assumption that a similar degree of robustness can be attained 
certainly holds true. 

As argued above, we chose the weighting function in Eq. fl25|) to result in observation- 
duration proportional entries in Ai(x, x') and A 2 (x, x'). Hence it is quite natural to add A = 
A\ + A 2 . However, choosing the approach Eq.(p6|) normalizes each data set independently. 
This logic naturally leads to considering the optimal combination of data to form A = 
oA\ + 5A 2 where a and 5 are positive constants. This specially weighted form, or a positive 
definite combination A = (1 — fi)Ai + (3A 2 with j3 G (0, 1), might be useful especially in the 
presence of different degrees of noise in the two data sets. An iterative numerical scheme to 
optimize (3 could then help to improve the solution by minimizing the effects of nullspace 
contamination. 

The optimal combination of data by addition of kernels Ai(x,x') and RHSs b{(x) pre- 
sented above was applied to the double well system with results for the gradient u(x) and 
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PES V(x) shown in Fig. [|. Information was successively added to the kernel A(x,x') by 
combining the data sets to form LT, LTR, and LTRH with the notation based on the initial 
conditions shown in Fig. [I]. In each case all configurations are weighted equally. The optimal 
a± values employed and defect measures are given in table [TT[ 

While the individual inverse problem solutions based on L, T, R, and H reproduce the 
potential in their respective neighborhoods quite well, they fail to give adequate results 
for the other portions of the potential. On the other hand, the reconstruction of large 
parts of the PES is successful if we optimally combine the data of the three experiments 
LTR. However, contrary to intuition, we observe that the solution is less satisfactory from 
combining all the data LTRH; some additional oscillations appear along with a dip in the 
vicinity of the initial wavepacket for H. Apparently the nullspace of the expanded domain 
cannot be fully managed by «i regularization alone; no attempt was made to simultaneously 
introduce «o and cti regularization. 



B. Other combinations of data 

Several other schemes for combining the raw density data can be envisioned, apart from 
the approach in Section IV A. One candidate would be the direct combination of p(x, t) 



data from different experiments. As an illustration we will treat the case of two different p's 
with 

p(x, t) = pi (x, t) + ep 2 {x, t) (34) 

and e being a positive constant. This combination is physically acceptable, as Ehrenfest's 
theorem in Eq. (0) is linear in the probability density. Insertion of this sum into the functional 
Jq{u(x)} and variation with respect to u(x) will yield a formulation analogous to the one 
describing inversion under the influence of noise in the data (see Section |1V (J| ) in Eq.(p8|) 
upon comparison of Eqs.(|36[) and 



The terms proportional to e° and e 2 will exactly correspond to what was found earlier 
in Eq . fl29|) . However, the terms proportional to e represent a cross correlation between 
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pi and pi. These cross terms can be significant, and they act to introduce an element 
of undesirable structure, often oscillatory, in the equations determining u(x). On physical 
grounds it is also artificial to directly correlate the independent experimental data p\ and 
P2 when seeking u(x). 

Hence, the scheme of adding together the bare p-data is expected to produce unreliable 
results. To support this argument we present a test on such a p-combination consisting of 
the sum of all four densities of the initial configurations L, T, R, and H 

p s (x, t) = p L (x, t) + p T (x, t) + p R (x, t) + p H (x, t) . (35) 

The corresponding inverted gradient and PES respectively are shown in Figs. |^a and ^b. 
The solution is rather poor and far worse than the LTRH combination using the same data. 
This result should not be taken to construe that other combinations of data might not give 
satisfactory results. However, the combination of and hi in Section |1V A| is quite natural 
and produces excellent inversion results. 



C. The influence of noise on the inversion 

Any real p-data will always be contaminated by some degree of noise. In an additive 
model this noise contaminated data p n (x, t) can be represented as 

p n (x, t) = p(x, t) + ej(x, t) , (36) 

where e > is a ordering parameter and the noise is described by the spatio-temporal 
function j(x,t). We assume that j(x,t) is a randomly varying function with vanishing 
average contribution and free from systematic error such that 

T 

^Jl(x,t)a(x,t)dt (37) 

o 

for any function a(x,t) of bounded norm over time that is not correlated with j(x, t). 

Inserting the ansatz in Eq.(|3"6]) into the functional Jq{u(x)} in Eq.(^) and taking the 
first variation, the equation determining u(x) is obtained 
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T T 

i // p(,'. *) dt „ M d*' + £ // tW ,, () d* d*' 



T T 

+ J l(rf,t)pfat) dtu{x') dx' + ^J J "f(x',t)"f(x,t) dtu(x') dx' 



T T 
771 f d 2 f 777 f d. 2 f 

= -— \ p(x,t) — / x'p(x',t)dx' dt-e— J p(x,t) — / x'j{x',t)dx' dt 


T T 

r ™ r a2 



—e- 



777 f d f 777 f d f 

T I 7(X ' ^dPJ X ' P ^'' f) dX ' dt ~ J 7(X ' ^ dP J X '^ X '' ^ dX ' dt ' (38) 
o o 

The terms proportional to e° recover the original unperturbed system in Eqs.([|-[|). Assuming 

the data noise level to be small, the terms in e 2 on both sides of Eq.(^) can be neglected. 

We first turn to the kernel side of Eq.(|38f) and denote all terms in e 1 as the error kernel 

5A(x,x') 

T T 

5A{x, x') = | J 7 (x\ t)p{x, t)dt+^ J p{x', t) 7 (x, t) dt . (39) 



Each term involves the computation of two-point spatial correlations between functions. 
However, the functions 7 and p are uncorrelated, and the temporal integral of their product 
is expected to result in only small random contributions to the kernel over x and x', especially 



for longer time integration as follows from Eq. (57). Following similar logic, the terms 
proportional to e 1 on the RHS of Eq.fl38|) should be negligible, especially for long time 
integration. Neglecting the e 2 terms finally leaves only the first term proportional to e° on 
the RHS. 

Hence, the functional Jo exhibits some inherent capability to deal with slightly noisy 
data. The time integration process averages out these noise effects so that they should have 
a decreasing impact on the inverse solution u(x). Longer periods of temporal data should 
make their behavior better. 

These results are also in accordance with the stability analysis presented in . Resort- 
ing to the matrix version of the inverse problem (cf., Eq.(|14])) the authors proved (Eq.(25) 
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in Ref. fT9| ) that the relative error in the solution u after regularization is bounded by the 
relative errors in the data Sb and S A. 

Moreover it was found (Eqs.(41) and (49) |]19[) that small perturbations in the noise ej 
will result in small proportional perturbations in b and A, which is excellent behavior for 
any application with finite time integration. These results can now be extended to the long 
time integration limit where the e 1 terms in Eq. (|39|) should further diminish in significance 
for T — > oo. Similar arguments apply to the RHS b ||35|| . 

Equation ( j39|) also demonstrates why the direct combination of bare p(x, t) data dis- 



cussed in Section [IV B| performs less satisfactory than the optimal combination scheme in 



Section |IV A| . In contrast to the slightly perturbed system cross term SA(x } x') above, the 
analogous term arising from directly combining the p data will not vanish. This will in- 
troduce an undesirable error contribution to the inverse problem. In contrast, the optimal 
combination scheme for different sets of data in Section [IV A| should profit from the inherent 
stability of the inversion procedure to deal with slightly noisy systems since this technique 
involves a sequence of separate time integrations. 

V. SUMMARY AND OUTLOOK 

This paper presented new results that improve and extend a recently suggested pro- 
cedure [[L9|] to extract potential energy surfaces (PES) from the emerging experimentally 



observable probability density |-^(x,t)| 2 data. The results of this paper should also be 
applicable to the more general case of extracting the dipole function from the additional 



observation of the applied laser electric field p0[ . 



An easy to implement regularization scheme was introduced, which increases the ac- 
curacy of the computed PES without loss of numerical stability. Furthermore an optimal 
reconstruction method was presented which combines data from different measurements. 
This scheme was argued to be optimal in the sense of reducing the nullspace of the inverse 
problem and hence increasing the domain of the extracted PES. Evidence was presented that 
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this scheme is stable under the influence of noise, but further investigations will be necessary 
to fully confirm these results. We hope that the developments in this paper stimulate the 
generation of appropriate probability density data for inversion implementation. 
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APPENDIX A: OPTIMALITY PROOF 

This section presents the lemma and its proof underlying the optimal combination of 
data from different measurements. 

Lemma 1 Given two Hermitian, positive semidefinite operators Ai, A 2 : H — > Ti acting on 
the Hilbert space Ti and their sum A = jiAi + uA 2 with coefficients /i, v e R > 0, it then 
holds that 

ker(A) = ker^x) n ker(A 2 ) . 
For finite dimensional ranges this implies that 

rank(A) = rank(Ai) + rank(A 2 ) — dim ( Range(Ai) fl Range(A 2 ) ) . 

In other words: Adding two positive semidefinite, Hermitian operators will reduce the 
nullspace of the combined operator to that of the intersection of both nullspaces. The 
generalization to a finite sum of operators A = J2k=i a kAk with constant > is evident. 
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Neither positivity nor Hermiticity can be omitted. Without the former criterion, a 
counter example is A 2 = —Ai, with \i — v — \. As an example, without the latter criterion, 
the two M 3x3 operators 



1 1 
1 



A, 



^ 0^ 



1 
1 1 



=>• A 1 + A 2 



1 1 1 
1 1 1 



(Al) 



with ranks 3, 2, and 1 lead to the contradiction 1 = 3 + 2 — 2. 

PROOF: As both operators A\ and A 2 are Hermitian, they have diagonal representations 
with respect to their eigenvectors AiJAi^) = Ai^ | Ai^) and A 2 \X 2 j) = X 2 ,j \^2,j)- Without 
loss of generality we choose the normalized eigenvectors {lA^j}} as the basis of TL 

Clearly, Ti can be decomposed in the following two ways into orthogonal subspaces 



H = ker(Ai) © Range(Ai 



(A2) 



and also 



H = ker(A 2 ) © Range(A 2 ) . 



(A3) 



In a similar fashion we can partition the spectrum of Ai, and hence Ws basis, into all 
eigenvectors that form a basis of Range(Ai) and those that generate ker(Ai). Since 7i is 
a complete linear space and A\,A 2 ,A are linear operators, it is sufficient to consider the 
basis states only. For any such state \Xi 7 i) we find 



(A M |A|Ai,i) = ^(Ai )f |Ai|Ai,i) + ^(Ai,i|A2|A M ) 



(A4) 



where we define the mean Aj = (Ai,j| A 2 \Xi t i) = J2j \ (X2,j\Xi t i)\ 2 X 2 ,j > 0. This quantity is 
always positive (or zero) by virtue of A 2 being positive semidefinite. 

In accordance with the decomposition in Eqs.( [A2[ ) and ( |A^ ) four different cases are to 
be distinguished: 
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|Ai (i ) G Range(Ai) : 



|A M ) G ker(A x ; 



|Ai,i) £ ker(A 2 ) (A M | A|Ai,i) = //A M + i/A< > 

|A M ) G ker(A 2 ) (A M | A|Ai,i) = ^A M + > 

(A5) 

|Ai,i) £ ker(A 2 ) =>■ (Ai ti |A|Ai,i) = + i/A* > 



|A M ) G ker(A 2 ) (A M |A|A M ) = + 

Therefore only (basis) vectors that lie in both nullspaces will belong to the nullspace of A, 
which proves the first part of the Lemma. The second part follows from the linear algebraic 
dimension relation 

dim ( Range(Ai) + Range(A 2 ) ) 

= rank(Ai) + rank(A 2 ) - dim (Range(Ai) n Range(A 2 ) ) , (A6) 

where "+" on the lefthand side denotes all linear combinations of the vectors in both ranges. 

Now, any vector that lies either in Range(A!) or in Range(A 2 ) will, with an argument 
similar to Eq. (|X5|) , always be in Range(A). We are thus allowed to replace 

rank(yuAi + uA 2 ) = dim ( Range(Ai) + Range(A 2 ) ) , (A7) 

which completes our proof. 

The values of fi, v > are arbitrary, although often physical constraints may suggest 
that some specific values may be better than others (see the discussion in Section [I V A|) . 



We note that the lemma's first part could have been proved without using a basis. 
The decomposition Eq.(|A^) and the differentiation of Eq.( |A5|) into \<p) G ker(Ai) or \(f>) £ 
ker(Ai) for any \<p) G TC suffices. However, the second part of the lemma requires the basis 
vectors. 
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TABLE I. Characteristics of the initial wavepackets 



Configuration index 


x 




classical turning 


; points 








left 


right 


H 


1.75 


0.081 


-2.1563 


2.1534 


R 


0.9977 


0.055 


-2.0013 


1.9978 


T 


0.0052 


0.061 


-2.0403 


2.0370 


L 


-1.002 


0.054 


-1.9996 


1.9961 



The configuration indices H, R, T, and K corresponding to the locations of wavepacket 
initial positions are shown in Fig. |l|. All wavepackets start with equal width a = 0.05 and 
are initially at rest centered at the respective starting position xq. The average energy of 
each packet as well as the corresponding turning points of an equivalent classical particle of 
the same energy are given. 
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TABLE II. Inversion regularization information 



Configuration a\ x a Xb |Au| x 10 3 |As| x 10 

Hi 3.3xl0~ 5 -4.0 4.0 384.58 0.03 

H 1.0 -2.0 2.0 11.52 23.46 

R 0.033 -1.5 1.5 7.16 1.06 

T 0.007 -1.5 1.5 9.02 0.05 

L 0.033 -1.5 1.5 6.53 1.07 

S 100.0 -1.5 1.5 9.53 111.63 

LTRH 0.333 -1.5 1.5 3.83 12.42 

LTR 0.01 -1.5 1.5 2.78 0.70 

LT 0.01 -1.5 1.5 3.10 0.49 



In this numerical case study the optimal regularization parameter value a\ was identified 
by scanning its effect on the solution defect \Au\. The inversion domains are x a ^ x ^ x\>. 
The system defect is |As|. The first five rows apply to the individual PES reconstructions 
shown in Fig. [|, and the last four rows refer to measurement combinations shown in Fig. |5|. 
See the text for details. 
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FIG. 1. The substituted malonaldehyde model system with its corresponding one dimensional 
potential energy function as given in Eq.@. L, T, R, H indicate the different wavepacket initial 
positions utilized for the simulated inversions. 
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-4 -2 2 4 

x fa.u.l 




-4 -2 2 4 

x [a.u.l 

FIG. 2. Contour plots of the kernel matrices A. (a) configuration H and (b) configuration T. 
The numerical values for the matrix entries range from ~ 10 3 on the diagonal to ~ 10~ 8 on the 
boundaries. The contour levels correspond to: 1 (outer line), 31, 61, . . . , 211. 
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10 2 | — i 1 1 1 1 r 




FIG. 3. cti parameter scans performed with configuration H. Panels (a) and (b) display the 
solution defect |Au| with respect to two different inversion ranges: — 2 ^ x ^ 2 and —4 ^ x ^ 4, 
respectively. Panel (c) shows the system defect |As| for the entire domain — 4 ^ x ^ 4. 
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FIG. 4. 
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FIG. |j. Extractions of the potential under the conditions given in table ||. (a) the time 
evolution of the position average (x(t)) accompanied by the left- and righthand variance 
(i.e., shaded regions bounded by Eqs. (|I8|) and ([19])) to indicate the regions predominantly 
covered by the probability densities. The grey domains on the extreme left and right mark 
classically forbidden areas (cf. table |).(b) the reconstructed u(x) and the corresponding 
potential V(x) in (c) with a suitably chosen additive constant. For comparison the exact 
solutions are included as dashed lines. The individual curves have been offset for graphical 
reasons and the detailed presentation of V(x) is restricted to \x\ < 2.5 since the boundary 
regions will not be extracted correctly due to lack of data sampling there. 
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FIG. 5. Extraction of the PES for optimally combined (LT, LTR, and LTRH) as well as 
p-combined data (E). See the text and table || for details. The curves for the derivative u(x) 
in (a) and the PES in (b) have been offset for graphical clarity and exact solutions (dashed lines) 
added for comparison. For optimal combinations of the data the original and reconstructed PES 
are almost indistinguishable. 
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